Picking the Fresh from the Rotten: Quote and Sentiment Extraction from Rotten Tomatoes Movie Reviews
Abstract
Being able to quickly determine whether a movie is worth watching is one of the major appeals of the movie review website Rotten Tomatoes (www.rottentomatoes.com). The site gathers reviews from known movie critics, with each critic's review assigned a binary rating expressing the review's sentiment, fresh (positive) or rotten (negative), and a quote, the single sentence or phrase from the review that best exemplifies that rating. This form of capsule summarization lets users discern the consensus of a group of critics at a glance. A system was constructed to perform the "Rotten Tomatoes Task" of identifying the quote and rating from an unlabeled review. A maximum entropy classifier was trained on a corpus of reviews from Rotten Tomatoes that had their ratings and accompanying quotes identified. Lexical and positional features were used. In addition, the notion of regional coherence, from text-segmentation work, was applied to create features that would help identify potential quotes from a review. A discussion of the nature of this "Rotten Tomatoes Task" is given, as well as future avenues of investigation, including alternative evaluation methods.
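The quote-identification half of the task can be illustrated with a minimal sketch. It assumes a binary maximum entropy model (equivalent to logistic regression) that scores each sentence of a review as quote or not-quote from unigram and position features. The function names, feature names, hyperparameters, and the two toy reviews are all hypothetical and are not taken from the paper; the regional-coherence features and the fresh/rotten rating are not covered here.

```python
import math
import re
from collections import defaultdict


def sentence_features(sentences, index):
    """Lexical and positional features for one sentence of a review (hypothetical feature set)."""
    feats = {"bias": 1.0}
    for word in re.findall(r"[a-z']+", sentences[index].lower()):
        feats["word=" + word] = 1.0           # lexical: unigram presence
    last = max(len(sentences) - 1, 1)
    feats["rel_position"] = index / last      # positional: 0.0 = first sentence
    feats["is_first"] = float(index == 0)
    feats["is_last"] = float(index == len(sentences) - 1)
    return feats


def predict(weights, feats):
    """P(sentence is the quote) under a binary maximum entropy model."""
    score = sum(weights.get(name, 0.0) * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-score))


def train(examples, epochs=200, rate=0.5, l2=0.01):
    """Stochastic gradient ascent on the L2-regularized log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, label in examples:
            error = label - predict(weights, feats)
            for name, value in feats.items():
                weights[name] += rate * (error * value - l2 * weights[name])
    return dict(weights)


def pick_quote(weights, sentences):
    """Index of the sentence the model considers most quote-like."""
    return max(range(len(sentences)),
               key=lambda i: predict(weights, sentence_features(sentences, i)))


# Invented toy data: (sentences of a review, index of the labeled quote).
review_a = ["The film opens on a rainy street.",
            "The plot follows two brothers.",
            "A gripping and moving triumph."]
review_b = ["It runs just under two hours.",
            "A dull and lifeless mess.",
            "The cast includes several newcomers."]
labeled = [(review_a, 2), (review_b, 1)]

examples = [(sentence_features(sents, i), 1.0 if i == quote else 0.0)
            for sents, quote in labeled
            for i in range(len(sents))]
weights = train(examples)
```

Calling `pick_quote(weights, review_a)` returns the index of the highest-scoring sentence. A real system would train on the labeled Rotten Tomatoes corpus and evaluate on held-out reviews rather than on the training examples themselves.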
Similar resources
Predicting Sentiment from Rotten Tomatoes Movie Reviews
The aim of the project is to experiment with different machine learning algorithms to predict the sentiment of unseen reviews from the improved corpus that has the additional sentiment information of all sub-phrases. We use the Rotten Tomatoes movie review corpus that has been greatly improved upon, and annotated with a fine sentiment score via Amazon Mechanical Turk. We want to see whether ha...
Rotten Tomatoes: Sentiment Classification in Movie Reviews
Text classification plays a large part in today's world, where the amount of information available is overwhelming. Although most text classification work is related to topical classification, the categorization of more subjective documents that depend more on style and the author's opinion is also important. Websites such as Amazon, IMDB, and Rotten Tomatoes rely on opinions and reviews to ke...
A Collaborative System for Sentiment Analysis
In the past we have witnessed our machine learning method for sentiment analysis coping well with figurative language, but determining with uncertainty the polarity of mildly figurative cases. We have shown that for these uncertain cases, a rule-based system should be consulted. We evaluate this collaborative approach on the "Rotten Tomatoes" movie reviews dataset and compare it with other stat...
Exploring Sentiment Summarization
We introduce the idea of a sentiment summary, a single passage from a document that captures a key aspect of the author’s opinion about his or her subject. Using supervised data from the Rotten Tomatoes website, we examine features that appear to be helpful in locating a good summary sentence. These features are used to fit Naive Bayes and regularized logistic regression models for summary extr...
Sentiment Analysis using Recursive Neural Network
This work is based on [1] where the recursive autoencoder (RAE) is used to predict sentiment distributions. In this project, I compare the performance of several different tree building schemes and find that greedily merging nodes with minimal autoencoder error gives the best performance, which is better than using the correct parsing tree, among others. I then apply the recursive neural networ...